# Multi-dialect adaptation

Whisper Small Ta

A Tamil speech recognition model fine-tuned from OpenAI's Whisper Small on the Common Voice 17.0 Tamil dataset, with a reported Word Error Rate (WER) of 43.23%.

Speech Recognition

Transformers Other
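Several entries in this list quote a Word Error Rate. As a rough illustration of what that figure measures, here is a minimal sketch of the usual definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` and the sample sentences are hypothetical, and the sketch applies no text normalization, so it will not necessarily reproduce the scores reported by any model listed here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sit on mat'):.2%}")  # 33.33%
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.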

Wav2vec2 Large Xlsr 53 Hungarian

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Hungarian Common Voice dataset

Speech Recognition

Transformers Other

Whisper With Augmentation Small Arabic With Diacritics

A version of openai/whisper-small fine-tuned on diacritized Arabic datasets; it transcribes Arabic speech to text with diacritical marks.

Speech Recognition

Whisper Tamil Large V2

A Tamil speech recognition model fine-tuned from OpenAI Whisper-large-v2 on multiple public Tamil ASR corpora

Speech Recognition Other

An Uzbek automatic speech recognition (ASR) model developed by the Oyqiz team, trained on the Common Voice 10.0 dataset

Speech Recognition

Transformers Other

Whisper Small Pashto

A Pashto (ps) speech recognition model fine-tuned from OpenAI Whisper-small on the FLEURS dataset

Speech Recognition

Transformers Other

Dansk Wav2vec21

This model is a Danish speech recognition model fine-tuned by Siyam/SKYLy on the common_voice dataset

Speech Recognition

Wav2vec2 Common Voice Tr Demo

A Turkish speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset

Speech Recognition

Transformers Other

Sinai Voice Ar Stt

An Arabic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice Arabic dataset

Speech Recognition

Transformers Arabic

Wav2vec2 Large Xls R 300m Mongolian

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on Mongolian datasets

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Spanish

A large-scale cross-lingual speech recognition model based on the Wav2Vec2 architecture, specifically optimized for Spanish, released by Facebook

Speech Recognition Spanish

Xls R Kyrgiz Cv8

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8.0 Kyrgyz dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr Hindi Commonvoice

A Hindi speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the common_voice dataset.

Speech Recognition

Wav2vec2 Large Xlsr Tamil Commonvoice

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Tamil dataset

Speech Recognition

Wav2vec2 Xls R 300m Gn Cv8 3

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on the Guarani (gn) Common Voice 8.0 dataset

Speech Recognition

Transformers Other

Xls R Uyghur Cv8

An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8 Uyghur dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Tatar

An automatic speech recognition model for Tatar, fine-tuned from facebook/wav2vec2-large-xlsr-53; it expects audio input sampled at 16 kHz.

Speech Recognition Other
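Models with a 16 kHz input requirement need audio recorded at other rates to be resampled first. The sketch below shows the idea with plain linear interpolation in pure Python. The function name `resample_linear` is hypothetical, and this naive approach applies no anti-aliasing filter, so it only illustrates the rate conversion; in practice a dedicated resampler from an audio library such as torchaudio or librosa is the safer choice.

```python
def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample a mono signal by linear interpolation (no anti-aliasing)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples:
        return []
    if src_rate == dst_rate:
        return [float(s) for s in samples]

    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for k in range(n_out):
        pos = k * step
        i = int(pos)
        if i >= last:
            # Past the final interval: hold the last sample.
            out.append(float(samples[last]))
            continue
        frac = pos - i
        # Weighted average of the two neighbouring source samples.
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
    return out


# One second of 48 kHz silence becomes 16,000 samples at 16 kHz.
one_second = [0.0] * 48_000
print(len(resample_linear(one_second, 48_000)))  # 16000
```

When downsampling, frequencies above half the target rate fold back into the signal unless a low-pass filter is applied first, which is the main reason to prefer a library resampler.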

Wav2vec2 Speechdat

A Swedish automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - SV-SE dataset.

Speech Recognition

Wav2vec2 Large Xlsr As

An Assamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset.

Speech Recognition Other

An Urdu automatic speech recognition model based on the wav2vec2 architecture, fine-tuned on the Common Voice dataset

Speech Recognition

Transformers Other

Wav2vec2 Xls R 300m W2V2 XLSR 300M YAKUT SMALL

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on a Yakut (Sakha) language dataset

Speech Recognition

Transformers Other

Wav2vec2 Tr AG V1

A Turkish speech recognition model based on the Wav2Vec2 architecture.

Speech Recognition

Wav2vec2 Xlsr Dhivehi

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-1b on Dhivehi speech datasets.

Speech Recognition

Transformers Other

Wav2vec2 Xls R 300m Gn Cv8

An automatic speech recognition (ASR) model for Guarani (gn), fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8 dataset.

Speech Recognition

Transformers Other

An automatic speech recognition (ASR) model based on the XLS-R architecture, fine-tuned on the COMMON_VOICE - AB dataset

Speech Recognition

Transformers Other

Wav2vec2 Xlsr Breton

An automatic speech recognition model for Breton, fine-tuned from facebook/wav2vec2-xls-r-1b.

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Sah CV8

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Yakut dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Es

A speech recognition model fine-tuned from Facebook's wav2vec2-large-xlsr-53 on the Spanish Common Voice dataset, with a test WER of 10.50%.

Speech Recognition

Transformers Spanish

An automatic speech recognition model based on the XLS-R architecture, fine-tuned on the Common Voice Abkhaz (ab) dataset

Speech Recognition

Transformers Other

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on the NBAILAB/NPSC - 48K_MP3 dataset

Speech Recognition

Wav2vec2 Large Xls R 300m Ab V4

An automatic speech recognition model fine-tuned from Facebook's wav2vec2-xls-r-300m on the Abkhazian (ab) dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xls R 300m Br D10

A Breton speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m, achieving a 52.3% Word Error Rate (WER) on the Common Voice 8 test set.

Speech Recognition

Transformers Other

Bert Base Arabic Camelbert Mix Pos Glf

A Gulf Arabic part-of-speech (POS) tagging model fine-tuned from CAMeLBERT-Mix on the Gumar dataset

Sequence Labeling

Transformers Arabic

Xls R 300m Ur Cv7

An Urdu automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - UR dataset

Speech Recognition

Transformers Other

HarrisDePerceptron

Wav2vec2 Xls R Urdu

An automatic speech recognition (ASR) model fine-tuned from Facebook's Wav2Vec2-Large-XLSR-53 on the Urdu Common Voice dataset

Speech Recognition

Transformers Other

Wav2vec2 Xls R 60 Urdu

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Urdu dataset

Speech Recognition

Transformers Other

A Hausa automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8.0 dataset.

Speech Recognition

Transformers Other

XLSR 1B Bokmaal Low

XLSR-1B-bokmaal-low is an automatic speech recognition (ASR) model focused on low-resource speech recognition tasks for Norwegian Bokmål.

Speech Recognition

Bert Base Arabic Camelbert Mix Pos Msa

A Modern Standard Arabic part-of-speech (POS) tagging model fine-tuned from CAMeLBERT-Mix on the PATB dataset

Sequence Labeling

Transformers Arabic

A Swahili automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8 dataset

Speech Recognition

Transformers Other


© 2025 AIbase